Recursive Sketching For Frequency Moments
Abstract
In a ground-breaking paper, Indyk and Woodruff (STOC 05) showed how to compute $F_k$ (for $k > 2$) in space complexity $O(\text{poly-log}(n,m) \cdot n^{1-2/k})$, which is optimal up to (large) poly-logarithmic factors in $n$ and $m$, where $m$ is the length of the stream and $n$ is the upper bound on the number of distinct elements in a stream. The best known lower bound for large moments is $\Omega(\log(n) \cdot n^{1-2/k})$. A follow-up work of Bhuvanagiri, Ganguly, Kesh and Saha (SODA 2006) reduced the poly-logarithmic factors of Indyk and Woodruff to $O(\log^2(m) \cdot (\log n + \log m) \cdot n^{1-2/k})$. Further reduction of poly-log factors has been an elusive goal since 2006, when the Indyk and Woodruff method seemed to hit a natural “barrier.” Using our simple recursive sketch, we provide a different yet simple approach to obtain an $O(\log(m) \log(nm) \cdot (\log\log n)^4 \cdot n^{1-2/k})$ algorithm for constant $\epsilon$. Our bound is, in fact, somewhat stronger: the $(\log\log n)$ term can be replaced by any constant number of $\log$ iterations instead of just two or three, thus approaching $\log^* n$. Our bound also works for non-constant $\epsilon$ (for details see the body of the paper). Further, our algorithm requires only 4-wise independence, in contrast to existing methods that use pseudo-random generators for computing large frequency moments.
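To make the two central notions of the abstract concrete, the following is a minimal illustrative sketch and not the paper's algorithm. It computes the frequency moment $F_k = \sum_i f_i^k$ exactly (using space linear in the number of distinct elements, which is precisely what the sketching results avoid) and builds a hash function from a random degree-3 polynomial over a prime field, a standard textbook construction of a 4-wise independent family. The names `frequency_moment`, `make_four_wise_hash`, and the choice of the prime `P` are assumptions made for this example.

```python
from collections import Counter
import random

def frequency_moment(stream, k):
    """Exact F_k: sum over distinct items i of (f_i)^k, where f_i is the count of i."""
    counts = Counter(stream)
    return sum(f ** k for f in counts.values())

# Field size for the hash family (illustrative choice: a Mersenne prime).
P = 2**61 - 1

def make_four_wise_hash(rng):
    """Draw a random degree-3 polynomial over Z_P.

    For keys in [0, P), polynomials of degree 3 with uniformly random
    coefficients form a 4-wise independent hash family.
    """
    coeffs = [rng.randrange(P) for _ in range(4)]

    def h(x):
        # Horner evaluation of c0 + c1*x + c2*x^2 + c3*x^3 modulo P.
        acc = 0
        for c in reversed(coeffs):
            acc = (acc * x + c) % P
        return acc

    return h

stream = [1, 2, 1, 3, 1, 2]           # frequencies: f_1 = 3, f_2 = 2, f_3 = 1
print(frequency_moment(stream, 3))    # 3^3 + 2^3 + 1^3 = 36
```

The exact computation serves only as a reference point: a streaming algorithm of the kind discussed above would estimate the same quantity while storing far fewer than $n$ counters.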
Similar resources
On the Multiplicative Zagreb Indices of Bucket Recursive Trees
Bucket recursive trees are an interesting and natural generalization of ordinary recursive trees and have a connection to mathematical chemistry. In this paper, we give the lower and upper bounds for the moment generating function and moments of the multiplicative Zagreb indices in a randomly chosen bucket recursive tree of size $n$ with maximal bucket size $b \geq 1$. Also, we consi...
The Subtree Size Profile of Bucket Recursive Trees
Kazemi (2014) introduced a new version of bucket recursive trees as another generalization of recursive trees where buckets have variable capacities. In this paper, we get the $p$-th factorial moments of the random variable $S_{n,1}$, which counts the subtrees of size 1 (leaves), and show a phase change of this random variable. These can be obtained by solving a first order partial...
Sketching the order of events
We introduce features for massive data streams. These stream features can be thought of as “ordered moments” and generalize stream sketches from “moments of order one” to “ordered moments of arbitrary order”. In analogy to classic moments, they have theoretical guarantees such as universality that are important for learning algorithms.
A Recursive Approximation Approach of non-iid Lognormal Random Variables Summation in Cellular Systems
Co-channel interference is a major factor in limiting the capacity and link quality in cellular communications. As the co-channel interference is modeled by a lognormal distribution, the sum of the co-channel interferences of neighboring cells is represented by the sum of lognormal Random Variables (RVs), which has no closed-form expression. Assuming independent, identically distributed (iid) RVs, the...
Linear Sketching over $\mathbb F_2$
We initiate a systematic study of linear sketching over $\mathbb F_2$. For a given Boolean function $f : \{0,1\}^n \to \{0,1\}$, a randomized $\mathbb F_2$-sketch is a distribution $M$ over $d \times n$ matrices with elements over $\mathbb F_2$ such that $Mx$ suffices for computing $f(x)$ with high probability. We study a connection between $\mathbb F_2$-sketching and a two-player one-way communication game for the corresponding XOR-function. Our results show th...
Journal: CoRR
Volume: abs/1011.2571
Publication date: 2010